TensorFlow
Transcript: Hello world, it's Siraj. The most popular machine learning library in the world right now is Google's TensorFlow. We're going to use it to build a classifier that can look at an image of a handwritten digit and classify what digit it is, in a small number of lines of code. Basically, pretty much every single Google product uses machine learning in some way, whether it's image search, image captioning, translation, or recommendations. Google needs machine learning to take advantage of their godlike data sets to give users the dopest experience. There are three different crowds that use machine learning: researchers, data scientists, and wizards. I mean developers. Ideally, they can all use the same tool set to collaborate with each other and improve their efficiency. TensorFlow was a solution they created to help solve this problem. Google doesn't just have a lot of data; they have the world's largest computer. So the library is built to scale: it was made to run on multiple CPUs or GPUs and even mobile operating systems, and it has several wrappers in several languages. My favorite one is Python. Objective-C, you broke my heart.
We have to install TensorFlow first. We're going to use pip, the Python package manager, to install it. Once we have pip, we can create an environment variable that points to the download URL for TensorFlow. Once we set the environment variable, we can download TensorFlow via pip install with the upgrade flag and the name of our environment variable.
Dope. Now that we have our dependencies installed, let's get to the code. We'll start off by importing our handwritten digit data set. The input data class is a standard Python class. It downloads the data set, splits it into training and testing data, and formats it for our use later on. And of course, we'll import TensorFlow. Now we can set our hyperparameters, or tuning knobs, for our model.
The first one is the learning rate, which defines how fast we want to update our weights. If the learning rate is too big, our model might skip the optimal solution. If it's too small, we might need too many iterations to converge on the best results. So we'll set it to a known decent learning rate for this problem. Definitely faster than Lil Wayne's.
Now we want to create our model. In TensorFlow, a model is represented as a data flow graph. The graph contains a set of nodes called `operations`. These are units of computation. They can be as simple as addition or multiplication, or as complicated as some multivariate equation. Each operation takes in a tensor as input and outputs a tensor as well. A tensor is how data is represented in TensorFlow. Tensors are multidimensional arrays of numbers, and they flow between operations, hence the name TensorFlow. It all makes sense.
We'll start building our model by creating two operations, both of which are placeholder operations. A placeholder is just a variable that we will assign data to at a later date. It's never initialized and contains no data. We'll define the type and shape of our data as the parameters. The input images X will be represented by a 2-D tensor of numbers, whose second dimension is the dimensionality of a single flattened MNIST image. Flattening an image means converting a 2-D array to a 1-D array by unstacking the rows and lining them up. This is more efficient formatting. The output classes Y will consist of a 2-D tensor as well, where each row is a one-hot vector showing which digit class the corresponding MNIST image belongs to.
Then we'll define our weights W and biases B for our model. The weights are the probabilities that affect how data flows in the graph, and they will be updated continuously during training so that our results get closer and closer to the right solution. The bias lets us shift our regression line to better fit the data. We'll then create a named scope.
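To make the data layout described above concrete, here is a minimal sketch in plain Python. It is not the TensorFlow code from the video: the helper names `flatten`, `one_hot`, and `linear` are made up for illustration, and a tiny 3x3 grid stands in for a real MNIST image (a standard 28x28 MNIST digit would flatten to 784 values).

```python
# Plain-Python illustration only; helper names and sizes are hypothetical.

def flatten(image):
    """Unstack the rows of a 2-D image and line them up into a 1-D list."""
    return [pixel for row in image for pixel in row]


def one_hot(label, num_classes):
    """Return a vector with 1.0 at index `label` and 0.0 everywhere else."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def linear(x, W, b):
    """Compute x·W + b for a single flattened image x."""
    return [
        sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
        for j in range(len(b))
    ]


# A tiny 3x3 "image" stands in for a 28x28 MNIST digit.
image = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]

x = flatten(image)    # [0, 1, 0, 0, 1, 0, 0, 1, 0]
y = one_hot(1, 10)    # digit class 1 out of 10 possible classes

# Weights: one row per input pixel, one column per digit class.
W = [[0.0] * 10 for _ in range(len(x))]
# Biases: one value per digit class.
b = [0.0] * 10

scores = linear(x, W, b)  # one raw score per digit class

print(len(x))       # number of values in the flattened image
print(y)            # the one-hot label vector
print(len(scores))  # number of class scores
```

In the video, the placeholders hold a whole batch of such rows at once, which is why X and Y are 2-D tensors rather than single vectors.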
Scopes help us organize nodes in the graph visualizer called TensorBoard, which we'll view at the end. We'll create three scopes. In the first scope, we'll implement our model, logistic regression, by matrix multiplying the input images X by the weight matrix W and adding the bias B. We'll then create summary operations to help us later visualize the distribution of our weights and biases. In the second scope, we'll create our cost function. The cost function helps us minimize our error during training, and we'll use the popular cross-entropy function as it. Then we'll create a scalar summary to monitor it during training so we can visualize it later. Our last scope is called train, and it will create our optimization function that makes our model improve during training. We'll use the popular gradient descent algorithm, which takes our learning rate as a parameter for pacing and our cost function as a parameter to help minimize the error.
Now that we have our graph built, we'll initialize all of our variables. Then we'll merge all of our summaries into a single operator, because we are extremely lazy. Now we're ready to launch our graph by initializing a session, which lets us execute our data flow graph. We'll then set our summary writer folder location, which we'll later load data from to visualize in TensorBoard.
Training time. Let's set our for loop for our specified number of iterations and initialize our average cost, which we'll print out every so often to make sure our model is improving during training. We'll compute our batch size and start training over each example in our training data. Next, we'll fit our model using the batch data in the gradient descent algorithm for backpropagation. We'll compute the average loss and write logs for each iteration via the summary writer. For each display step, we'll display error logs to the terminal. That's it for training. We can then test the model by comparing our model values to our output values.
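The model, cost function, and training loop described above can be sketched without TensorFlow. The following is a rough plain-Python illustration under several assumptions: the toy data set, the learning rate of 0.5, and the 50 passes over the data are all invented for the example, and the weights are updated after every single example rather than per batch as in the video.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(x, W, b):
    """Model: softmax of x·W + b for one example."""
    scores = [
        sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
        for j in range(len(b))
    ]
    return softmax(scores)


def cross_entropy(probs, target):
    """Cost: cross-entropy between predictions and a one-hot target."""
    return -sum(t * math.log(p + 1e-12) for p, t in zip(probs, target))


def train(data, num_features, num_classes, learning_rate, epochs):
    """Gradient descent, updating the weights after each example."""
    W = [[0.0] * num_classes for _ in range(num_features)]
    b = [0.0] * num_classes
    avg_costs = []
    for _ in range(epochs):
        total_cost = 0.0
        for x, y in data:
            probs = predict(x, W, b)
            total_cost += cross_entropy(probs, y)
            # For softmax with cross-entropy, the gradient of the cost
            # with respect to each class score is (probability - target).
            for j in range(num_classes):
                grad = probs[j] - y[j]
                b[j] -= learning_rate * grad
                for i in range(num_features):
                    W[i][j] -= learning_rate * grad * x[i]
        avg_costs.append(total_cost / len(data))
    return W, b, avg_costs


def accuracy(data, W, b):
    """Fraction of examples whose predicted class matches the label."""
    correct = 0
    for x, y in data:
        probs = predict(x, W, b)
        if probs.index(max(probs)) == y.index(max(y)):
            correct += 1
    return correct / len(data)


# Hypothetical toy data: two features, two classes, one-hot labels.
data = [
    ([1.0, 0.0], [1.0, 0.0]),
    ([0.9, 0.1], [1.0, 0.0]),
    ([0.0, 1.0], [0.0, 1.0]),
    ([0.1, 0.9], [0.0, 1.0]),
]

W, b, avg_costs = train(
    data, num_features=2, num_classes=2, learning_rate=0.5, epochs=50
)
print("first average cost:", avg_costs[0])
print("last average cost:", avg_costs[-1])
print("accuracy:", accuracy(data, W, b))
```

The average cost printed at the start and end plays the same role as the cost logged at each display step in the video: if it is not going down, the learning rate or the model needs another look.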
We'll calculate the accuracy and print it out for test data. The accuracy gets better with training, and once we've trained and tested our model, it'll be able to classify novel MNIST digits pretty well.
We can then visualize our graph in TensorBoard. Yo, pretty colors and stuff. In our browser, we'll be able to view the output of our cost function over time under the Events tab. Under Histograms, we'll be able to see the variance in our biases and weights over time. Under Graphs, we can view the actual graph we created, as well as the variables for weights and bias. We can see the flow of tensors in the form of edges connecting our nodes, or operations. We can see each of the three scopes we named in our code earlier, and by double-clicking on each, we can see a more detailed view of how tensors are flowing through each.
Lots of cool links in the description, and please hit that subscribe button if you want to see more ML videos. For now, I've got to go dockerize my environment. So, thanks for watching.